Abstract

Using the Logit quantal response form as the response function in each step, the original definition of static quantal
response equilibrium (QRE) is extended into an iterative evolution process. QREs remain as the fixed points of the dynamic 
process. However, depending on whether such fixed points are the long-term solutions of the dynamic process, they can be 
classified into stable (SQREs) and unstable (USQREs) equilibriums. This extension resembles the extension from static Nash 
equilibriums (NEs) to evolutionary stable solutions in the framework of evolutionary game theory. The relation between 
SQREs and other solution concepts of games, including NEs and QREs, is discussed. Using experimental data from other 
published papers, we perform a preliminary comparison between SQREs, NEs, QREs and the observed behavioral outcomes 
of those experiments. For certain games, we determine that SQREs have better predictive power than QREs and NEs. 
Introduction

Game theory has become a powerful and popular tool in many 
sociological studies. Although several studies have questioned
the predictive power of the Nash equilibrium (NE) [1,2], it has been
used as a primary game solution since its initial proposition [3,4].
However, the questions of finding such NEs and refining them 
when multiple NEs exist are not easy tasks [5,6]. Furthermore, 
people are interested to know how, in experiments or real-life 
observations, one "preferred" NE emerges from all possible 
strategy profiles, particularly in a population that does not begin 
with a NE as the initial strategic state. This phenomenon is the 
well-known question of learning in games and the converging 
towards particular solutions [7,8]. 

To find game solutions with predictive power, in addition to first 
searching for all NEs and then refining them [9], dynamic 
processes have been proposed to describe, mimic or reproduce to a 
certain extent the strategic thinking processes of game players in 
the hope that certain long-term solutions of the dynamic processes 
will lead to the "preferred" NEs [6,10]. Well-known examples of
such dynamic process include replicator dynamics [11-13], Logit 
learning [14], and fictitious play [15,16]. In certain cases, a refined 
NE fits the experimental data well. We refer to such an NE as the 
preferred NE. In this case, a proposed evolutionary model is a 
good theory if the model predicts that long-term solutions of the 
corresponding dynamic processes converge to the refined NE. 
Alternatively, in other cases, no NE can explain the observed 
behavior in real experiments. In this case, a good theory means 
that long-term solutions of the proposed dynamic processes can
explain the observed behavior instead of the NEs [17]. To simplify
our terminology, we denote both the NE and the long-term 
solution in these cases, where they are capable of describing 
experimental or real-life observations, the preferred NE. The 
primary goal of these typical dynamic processes, and thus of all of 
these theories, is to determine the preferred NE by solving for the 
long-term solutions of the proposed dynamic processes. For a 
dynamic process, usually two central topics should be discussed: 
how well experimental observations can be explained by long-term 
solutions of the dynamic process and the relation between the 
dynamic process's long-term solutions and other solution concepts 
such as NEs and refined NEs. 

In this manuscript, we study properties of a new dynamic 
process: the iterative Logit quantal response dynamics (ILQRD), 
which will be defined based on the concept of static Quantal
Response Equilibriums (QREs). Our goal of proposing this new 
dynamic process is solely to capture the preferred NE with long- 
term stable solutions of ILQRD, which we denote as stable QREs 
(SQREs). 

This manuscript is organized as follows. In this introduction, we 
first explain our main idea: the evolutionary process. In the next 
section, we define several notations and the dynamic process. 
There, we also compare the new dynamic process with other 
learning models and evolutionary processes in game theory. In the 
remaining parts of this manuscript, we discuss the two
previously introduced central topics of this ILQRD process: how
well the long-term solutions fit experimental results and what the
relation is between its long-term solutions and other solution
concepts. Next, we illustrate the performance of our dynamic
process using examples and provide an analytical proof of the
major conclusions for the special case of 2x2 symmetric games.
Then, we compare our SQRE with a highly similar solution
concept: the quantal response stable solutions (QRSS). After that,
using SQRE, we re-analyze some collected experimental results. 
Finally, we summarize our main conclusions and discuss possible 
future research. 



For 2x2 games, the payoff matrix $G$ indeed appears as a matrix.
However, for a general NxM game, the matrix becomes a map
from $N$ vectors to $\mathbb{R}$, i.e., cubic tensors for 3-player games and
$T(N,0)$-type tensors for $N$-player games. The payoff is no longer a
matrix.

To unify the notation of all NxM games, we introduce a new
equivalent set of notations as follows. We write payoff matrices and
mixed strategies as matrices, as given in Eq. (5) and Eq. (6).



Notations and Definitions 

To present our formula in a compact form and to remain 
consistent with the notations and the terminology of statistical 
ensembles used in statistical physics, in this section, we introduce a 
matrix-based notation to represent a general NxM game. The key
notation that differs from the conventional mathematical forms of
game theory is the matrix representation of probability distributions
and the payoff matrices of general $N$-player games. One may
proceed directly to Eq. (18) and Eq. (19) and continue from there if
learning this new notation presents an obstacle. Most of our
expressions can be understood in terms of the conventional 
mathematics of game theory. However, we believe that they can 
be understood more conveniently using the new notation. 
Furthermore, the new notation is readily applicable to quantum 
games [18]. 

A new set of matrix-based notations 

Here, we introduce a matrix-based notation for probability 
distributions such that the probability distribution of the strategic 
status of all players and the mathematical description of payoffs for 
games with an arbitrary number of players and an arbitrary 
number of strategies become matrices. However, to understand 
our work in this manuscript, this new set of notation is not 
necessary. One may skip this section and proceed directly to Eq. 
(18) and Eq. (19).

Consider a 2x2 game with the following conventional form of 
payoff bi-matrix: 



$$G^{1,2} = \begin{bmatrix} a,a' & b,c' \\ c,b' & d,d' \end{bmatrix} \tag{1}$$



with the convention that the row (column) strategies belong to the
first (second) player and the first (second) number of all entries
represents the payoff received by the first (second) player. We
denote the first (second) player's strategies as $C$, $D$ (also $C$, $D$, although
one player's strategies can be totally different from the other
player's strategies). Usually, mixed strategies, which include pure
strategies as special cases, are written as column vectors, for
example,



$$P^{1} = \left[p^{1}, 1-p^{1}\right]^{T} \tag{2}$$

for player 1 and

$$P^{2} = \left[p^{2}, 1-p^{2}\right]^{T} \tag{3}$$



for player 2. The payoff is calculated from the following
vector-matrix-vector multiplication:

$$E^{1,2} = \left(P^{1}\right)^{T} G^{1,2} P^{2}. \tag{4}$$



In the matrix-based notation, the payoff matrices are written as

$$H^{1,2} = \begin{bmatrix} a,a' & 0 & 0 & 0 \\ 0 & b,c' & 0 & 0 \\ 0 & 0 & c,b' & 0 \\ 0 & 0 & 0 & d,d' \end{bmatrix} \tag{5}$$

and the mixed strategies as

$$\rho^{i} = \begin{bmatrix} p^{i} & 0 \\ 0 & 1-p^{i} \end{bmatrix}, \quad i = 1,2. \tag{6}$$



Then we calculate the payoff from the following trace operation: 



$$E^{1,2} = \mathrm{tr}\left(H^{1,2}\rho^{1}\otimes\rho^{2}\right). \tag{7}$$



One can confirm that with the same strategy profiles, Eq. (7)
and Eq. (4) result in the same payoffs. For the special case of the
2x2 game, in the new formalism, the payoff matrices and the
matrices of the strategy state of all players are of dimension $2^{2}$. The
probability distribution of both players is defined as a direct
product of each player's state matrix:

$$\rho = \rho^{1}\otimes\rho^{2}. \tag{8}$$
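The equivalence between Eq. (4) and Eq. (7) can be checked numerically. The following is a minimal sketch, not code from the paper: the payoff values and the function names `payoffs_vector_form` and `payoffs_trace_form` are assumptions chosen only for illustration.

```python
import numpy as np

# Illustrative payoff entries (assumed values, not taken from the paper).
a, b, c, d = 3.0, 0.0, 5.0, 1.0        # player 1: a, b, c, d
a2, b2, c2, d2 = 2.0, 4.0, 1.0, 0.5    # player 2: a', b', c', d'


def payoffs_vector_form(p1, p2):
    """Eq. (4): vector-matrix-vector product, one 2x2 matrix per player."""
    G1 = np.array([[a, b], [c, d]])       # first numbers of the bi-matrix in Eq. (1)
    G2 = np.array([[a2, c2], [b2, d2]])   # second numbers of the bi-matrix in Eq. (1)
    P1 = np.array([p1, 1.0 - p1])
    P2 = np.array([p2, 1.0 - p2])
    return P1 @ G1 @ P2, P1 @ G2 @ P2


def payoffs_trace_form(p1, p2):
    """Eq. (7): trace of the diagonal payoff matrix times the product state."""
    H1 = np.diag([a, b, c, d])            # player 1's entries of Eq. (5)
    H2 = np.diag([a2, c2, b2, d2])        # player 2's entries of Eq. (5)
    rho1 = np.diag([p1, 1.0 - p1])        # Eq. (6)
    rho2 = np.diag([p2, 1.0 - p2])
    rho = np.kron(rho1, rho2)             # Eq. (8): direct product state
    return np.trace(H1 @ rho), np.trace(H2 @ rho)


vec = payoffs_vector_form(0.3, 0.8)
tr = payoffs_trace_form(0.3, 0.8)
print(np.allclose(vec, tr))
```

The Kronecker product orders the joint outcomes as (C,C), (C,D), (D,C), (D,D), which is why the diagonal of `H2` lists a', c', b', d' in that order.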



Every entry of this state matrix corresponds to the probability of
all of the players choosing the corresponding strategic combination
defined by the position of the entry. For example, the (1,1) (upper-left)
entry of $\rho$ is $p^{1}p^{2}$ and means that with this probability, the two
players take the strategic combination $(C, C)$. In turn, the
corresponding entries in the payoff matrices $H^{1,2}$, their
(1,1) entries, are naturally $(a,a')$. In this sense, from the general
expression

$$E^{i} = \mathrm{tr}\left(H^{i}\rho^{1}\otimes\rho^{2}\right) = H^{i}\left(\rho^{1},\rho^{2}\right), \tag{9}$$

$H^{i}$ can be interpreted as a linear map from the set of $\{(\rho^{1},\rho^{2})\}$
to the real numbers $\mathbb{R}$.

Another useful feature of this notation system is that it
streamlines the description of correlated strategies. That is, this
notation also functions when $\rho \neq \rho^{1}\otimes\rho^{2}$. For an NxM game, $s^{i}_{l}$
stands for the $l$th strategy of the $i$th player, where $i\in[1,N]$ and
$l\in[1,M]$. The set of strategies of player $i$ is denoted as
$S^{i} = \{s^{i}_{1}, s^{i}_{2}, \cdots, s^{i}_{M}\}$. For convenience, we denote the set of
probability distributions over $S^{i}$ as $\Delta^{i}$, i.e., $\Delta^{i} = \{\rho^{i}\}$, and the
direct product set of all of these probability distributions as $\Delta$, i.e.,
$\Delta = \Delta^{1}\otimes\Delta^{2}$. This $\Delta$ differs from the set of probability distributions
over $S = S^{1}\otimes S^{2}$, which we denote as $\Delta(S)$. This space includes
the correlated strategy, whereas $\Delta$ is the set of only independent


PLOS ONE | www.plosone.org | August 2014 | Volume 9 | Issue 8 | e105391



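strategies. For example, in this notation, a general possibly
correlated equilibrium [19] can be defined as $\rho^{12}_{eq}\in\Delta(S)$ such that
for every player $i$, $\forall\rho^{i}\in\Delta^{i}$,

$$\mathrm{tr}\left(H^{i}\rho^{12}_{eq}\right) \geq \mathrm{tr}\left(H^{i}\rho^{i}\otimes\mathrm{tr}^{i}\left(\rho^{12}_{eq}\right)\right), \tag{10}$$

where $\mathrm{tr}^{i}(\cdot)$ is the partial trace, which performs the partial
integral/summation over player $i$'s strategy space. For example,

$$\mathrm{tr}^{1}\left(\rho^{12}\right) = \sum_{s^{1}}\rho^{12}\left(s^{1},\cdot\right), \tag{11}$$

and the result is a strategy profile of player 2. This partial trace is
the same as the partial summation used to derive marginal distributions
in probability theory. In our notation, an NE is defined as $\rho^{1}_{eq}\otimes\rho^{2}_{eq}$
or, equivalently, $\rho^{12}_{eq}\in\Delta$ such that for $\forall i$, $\forall\rho^{i}$,

$$\mathrm{tr}\left(H^{i}\rho^{1}_{eq}\otimes\rho^{2}_{eq}\right) \geq \mathrm{tr}\left(H^{i}\rho^{i}\otimes\mathrm{tr}^{i}\left(\rho^{1}_{eq}\otimes\rho^{2}_{eq}\right)\right). \tag{12}$$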
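The partial trace used in Eq. (10) to Eq. (12) can be illustrated with a short sketch. The function name `partial_trace`, its arguments and the example joint state are assumptions made here for illustration; the only claim is that summing out one player's index of a joint state matrix returns the other player's marginal state.

```python
import numpy as np


def partial_trace(rho12, traced_player, dims=(2, 2)):
    """Sum out one player's strategies from a joint state matrix.

    rho12 is indexed in Kronecker order, i.e. row = s1 * n2 + s2.
    """
    n1, n2 = dims
    r = rho12.reshape(n1, n2, n1, n2)      # axes: (s1, s2, s1', s2')
    if traced_player == 1:
        return np.einsum("ijik->jk", r)    # marginal state of player 2
    return np.einsum("ijkj->ik", r)        # marginal state of player 1


# A correlated joint distribution: (C,C) and (D,D) each with probability 1/2.
rho12 = np.diag([0.5, 0.0, 0.0, 0.5])
rho2 = partial_trace(rho12, traced_player=1)
rho1 = partial_trace(rho12, traced_player=2)

# The joint state is not the direct product of its marginals,
# so it lies in Delta(S) but not in Delta.
is_product = np.allclose(np.kron(rho1, rho2), rho12)
print(np.diag(rho1), np.diag(rho2), is_product)
```

For a product state built with `np.kron`, the same function returns the original factors, which is the situation covered by the NE definition in Eq. (12).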



Definition of iterative Logit quantal response dynamics
(ILQRD) and its stable equilibriums

Using the previously described matrix-based notation, for a
2-player game our iterative Logit quantal response dynamics
(ILQRD) is defined as follows:

$$\rho^{1}(t+1) = \frac{1}{Z^{1}(t+1)}\, e^{\beta H^{1}_{R}\left(\rho^{2}(t)\right)}, \tag{13}$$

$$\rho^{2}(t+1) = \frac{1}{Z^{2}(t+1)}\, e^{\beta H^{2}_{R}\left(\rho^{1}(t+1)\right)}, \tag{14}$$

where the reduced payoff matrix is defined as

$$H^{1}_{R}\left(\rho^{2}\right) = H^{1}\left(\cdot,\rho^{2}\right), \tag{15}$$

$$H^{2}_{R}\left(\rho^{1}\right) = H^{2}\left(\rho^{1},\cdot\right). \tag{16}$$

The normalization constant is defined as

$$Z^{i}(t+1) = \mathrm{tr}\left(e^{\beta H^{i}_{R}}\right). \tag{17}$$

Using the matrix-based notations, one can straightforwardly
extend this ILQRD to general NxM games.

Using the usual notation of probability distribution, for the 2x2
game defined in Eq. (1), Eq. (2) and Eq. (3), we can rewrite the
iteration process explicitly as follows:

$$p^{1}(t+1) = \frac{e^{\beta\left(a p^{2}(t) + b\left(1-p^{2}(t)\right)\right)}}{e^{\beta\left(a p^{2}(t) + b\left(1-p^{2}(t)\right)\right)} + e^{\beta\left(c p^{2}(t) + d\left(1-p^{2}(t)\right)\right)}}, \tag{18}$$

$$p^{2}(t+1) = \frac{e^{\beta\left(a' p^{1}(t+1) + b'\left(1-p^{1}(t+1)\right)\right)}}{e^{\beta\left(a' p^{1}(t+1) + b'\left(1-p^{1}(t+1)\right)\right)} + e^{\beta\left(c' p^{1}(t+1) + d'\left(1-p^{1}(t+1)\right)\right)}}. \tag{19}$$

To simplify our notation, we denote the RHS of Eq. (18) as
$g(p;\beta|a,b,c,d)$, where $a$, $b$, $c$ and $d$ can be omitted when it is clear
what the parameters $a$, $b$, $c$ and $d$ refer to. Formally, we denote this
map as

$$p^{2}(t+1) = f\left(p^{2}(t)\right) = g\left(g\left(p^{2}(t);\beta|a,b,c,d\right);\beta|a',b',c',d'\right), \tag{20}$$

and $p^{1}(t+1)$ can be regarded as an intermediate variable. This map
is an iterative map from a mixed strategy ($p^{2}(t)$) to a new mixed
strategy ($p^{2}(t+1)$).

The fixed points of this ILQRD are the same as the static Logit
QREs, and they are denoted as $\rho_{q}(\beta) = \left(\rho^{1}_{q}(\beta), \rho^{2}_{q}(\beta)\right)$. If a fixed
point among those QREs is also the long-term solution of the ILQRD,
this fixed point is referred to as a stable QRE (SQRE) and denoted
as $\rho_{\infty}(\beta) = \left(\rho^{1}_{\infty}(\beta), \rho^{2}_{\infty}(\beta)\right)$. Otherwise, it will be referred to as an
unstable QRE (USQRE). Next, we focus on the relations among
pure-strategy NEs, mixed NEs, QREs, SQREs and experimental
data. A SQRE must be a QRE. However, the converse is
not necessarily true. This stability test potentially differentiates
QREs into SQREs and USQREs. In principle, such differentiation
can improve the predictive power of QREs for experimental
data, and examining/demonstrating this is the whole point of the
present manuscript.
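The iteration of Eq. (18) to Eq. (20) and a simple stability check can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the payoff tuple `PAY` is a hypothetical symmetric coordination game already divided by its maximum payoff, and `is_stable` operationalizes stability through the standard one-dimensional criterion that the slope of the map at a fixed point has magnitude below one.

```python
import math

# Hypothetical normalized payoffs (a, b, c, d); for a symmetric game the
# primed payoffs (a', b', c', d') take the same values.
PAY = (0.2, 0.0, 0.0, 1.0)


def g(p, beta, a, b, c, d):
    """RHS of Eq. (18): logit response to an opponent who plays C with probability p."""
    x = beta * ((a * p + b * (1.0 - p)) - (c * p + d * (1.0 - p)))
    if x >= 0:                       # branch chosen to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def f(p2, beta, pay1, pay2):
    """Eq. (20): one full ILQRD step mapping p2(t) to p2(t+1)."""
    p1_next = g(p2, beta, *pay1)     # Eq. (18)
    return g(p1_next, beta, *pay2)   # Eq. (19)


def iterate(p2_0, beta, pay1, pay2, steps=200):
    """Apply the map repeatedly; returns the long-term value of p2."""
    p2 = p2_0
    for _ in range(steps):
        p2 = f(p2, beta, pay1, pay2)
    return p2


def is_stable(p_star, beta, pay1, pay2, h=1e-6):
    """Assumed criterion: |f'(p*)| < 1, estimated by a central difference."""
    slope = (f(p_star + h, beta, pay1, pay2) - f(p_star - h, beta, pay1, pay2)) / (2 * h)
    return abs(slope) < 1.0


low = iterate(0.5, 30.0, PAY, PAY)    # starts below the unstable fixed point
high = iterate(0.95, 30.0, PAY, PAY)  # starts above it
print(low, high, is_stable(low, 30.0, PAY, PAY), is_stable(high, 30.0, PAY, PAY))
```

With these assumed payoffs and $\beta = 30$, the two starting points end at different fixed points, which is the behavior that distinguishes SQREs by their basins of attraction.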



Difference between our evolutionary process and other
learning/imitating models

In ILQRD, a key concept is the use of the quantal response 
function (as in Eq. (13) and Eq. (14)) to determine a player's 
strategy profile according to the player's corresponding payoffs. 
The introduction of the parameter $\beta$ as a description of bounded
rationality in this form of quantal response function is common in
theories of learning in games [20] and in the QRE concept [21].
Additionally, this idea may be proposed simply from the viewpoint
of statistical physics [18]. The use of the parameter $\beta$ can be justified
to a certain degree based on games with limited information
[22,23].

In fact, the same expression used in Eq. (13) and Eq. (14) has 
been used in discussions of Logit QRE, and a highly similar 
expression has been used in Logit learning [14,24], stochastic 
fictitious play [15] and stochastic reinforcement learning [25]. In 
these theories, the quantal response function, as in Eq. (13) and 
Eq. (14), is occasionally referred to as the smoothed best response.
However, all of these theories differ from ours in principle, as
explained below.

First, we compare our expressions with the QRE. In the QRE,

$$\rho^{i} = \frac{1}{Z^{i}}\, e^{\beta H^{i}_{R}\left(\rho^{-i}\right)} \tag{21}$$

is a map from all players' strategy profile $\rho = \prod_{i}\otimes\rho^{i}$ to itself. This
equation is a fixed-point equation. Our work differs from the QRE
in that a QRE only focuses on fixed points solved from static
equations, whereas we use iterations to find the stable fixed points
and distinguish them from other, unstable fixed points. Later, we
will note that such a difference in stability is essential in applying
the QRE to explaining experiments.

The QRE has been compared with experimental observation
and generally provides a better fit to the data than the NE [26].
However, the QRE has been criticized as an illusory improvement
because there is an additional free parameter in the QRE when fitting
the curve, and one can always improve a fit using an additional
parameter. We demonstrate that this free parameter is not
completely free. In fact, in certain cases, when $\beta$ is sufficiently
large, the fixed points from the QRE are no longer stable.
Therefore, when comparing experimental data with the stable/unstable
QREs and the NEs, one can determine whether the QRE
or the NE has more predictive power, and the QRE is then not
always better than the NE. Distinguishing stable QREs from
unstable QREs using iterative dynamics is this manuscript's first
contribution. As presented below, we collected experimental data
and conducted a preliminary comparison of the theories with
experimental observations. After distinguishing SQREs from
QREs, we tested SQREs, QREs and NEs against several
experiments reported in the literature, which is this manuscript's
second contribution. One possible further investigation along this
line, which has not been implemented in this work, would be
to apply our ILQRD to cross-game experiments. For example,
using the same players in different games with similar levels of
payoffs, we can estimate the parameter $\beta$ from one game and test
it in other games. In principle, this should be even more interesting
than simply testing SQREs against experimental results.

In a dynamic QRE [24], so-called Logit learning [14], which is
driven by observations during real game-playing processes in
which each player chooses only one pure strategy to play every
turn, the same smoothed best-response function is used to mimic
the player's response to the opponent's pure strategy, as follows:

$$P\left(s^{i}(t) \mid s^{-i}(t-1)\right) = \frac{e^{\beta E^{i}\left(s^{i}(t),\, s^{-i}(t-1)\right)}}{\sum_{s^{i}} e^{\beta E^{i}\left(s^{i},\, s^{-i}(t-1)\right)}}. \tag{22}$$

If the smoothed best-response function is replaced by the real
best-response function, then this becomes simply the best-response
dynamics. Our model differs from this one in that the states of players can
be mixed strategies in our model, whereas only pure strategies are
allowed in this model.

In fact, this smoothed best-response function defines a
probability transition matrix between the current strategy profiles
of player $i$ and the previous strategy profiles of all of the other
players. This transition probability depends not on player $i$'s
previous state but on the previous states of all of the other
players, $s^{-i}(t-1)$. For simplicity, we express this probability as
follows:

$$M^{(i)}\left(s^{-i}(t)\right) = \left[P\left(s^{i}, s^{-i}(t)\right)\right]_{s^{i}, s^{-i}}. \tag{23}$$

If we let each player take his or her turn in the natural order, we
will have a transition matrix between the current and the previous
strategy profiles of all of the players. For simplicity, we denote this
matrix as

$$M = \prod_{i} M^{(i)}. \tag{24}$$

In the above formula, taking $N = 2$ as an example, the matrix may
be written according to one of the two following rules:

$$M = M^{(1)}\left(s^{-1}(t-1)\right) M^{(2)}\left(s^{-2}(t)\right), \tag{25}$$

$$M = M^{(1)}\left(s^{-1}(t-1)\right) M^{(2)}\left(s^{-2}(t-1)\right). \tag{26}$$

The first rule is referred to as alternating updating, whereas the
second rule is referred to as simultaneous updating. Regardless of
the form assumed, the central task is to determine the invariant
probability distribution using the transition matrix $M$:

$$P_{\infty} = \lim_{n\to\infty} M^{n} P_{0}. \tag{27}$$
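The long-term distribution of Eq. (27) can be approximated by repeated multiplication. The sketch below assumes simultaneous updating as in Eq. (26), a logit response over pure strategies, and a hypothetical normalized coordination game; the names `logit_choice`, `transition_matrix` and `long_term_distribution` are introduced here for illustration only.

```python
import numpy as np


def logit_choice(payoffs, beta):
    """Smoothed best response over pure strategies (softmax of the payoffs)."""
    w = np.exp(beta * (payoffs - payoffs.max()))   # shift for numerical safety
    return w / w.sum()


def transition_matrix(G1, G2, beta):
    """Column-stochastic M over pure profiles (s1, s2), simultaneous updating.

    G1[r, c] and G2[r, c] are the payoffs of players 1 and 2 when player 1
    plays row r and player 2 plays column c.
    """
    M = np.zeros((4, 4))
    for s1 in range(2):
        for s2 in range(2):
            q1 = logit_choice(G1[:, s2], beta)     # player 1 responds to s2
            q2 = logit_choice(G2[s1, :], beta)     # player 2 responds to s1
            M[:, 2 * s1 + s2] = np.kron(q1, q2)    # next profile distribution
    return M


def long_term_distribution(M, P0, n=500):
    """Eq. (27): apply M repeatedly to the initial distribution P0."""
    P = P0.copy()
    for _ in range(n):
        P = M @ P
    return P


# Hypothetical normalized coordination game (assumed values).
G1 = np.array([[0.2, 0.0], [0.0, 1.0]])
G2 = np.array([[0.2, 0.0], [0.0, 1.0]])
M = transition_matrix(G1, G2, beta=2.0)
P_inf = long_term_distribution(M, np.array([1.0, 0.0, 0.0, 0.0]))
print(P_inf)   # probabilities of (C,C), (C,D), (D,C), (D,D)
```

Because every entry of `M` is positive for finite $\beta$, the limit does not depend on the initial distribution `P0`; this is one way in which a QRSS differs from the initial-value-dependent SQREs discussed in this manuscript.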

To distinguish fixed-point solutions of this transition matrix
from a QRE, these long-term solutions are occasionally referred to
as end results of Logit response dynamics [14]. Here, we name
such solutions quantal response stable solutions (QRSSs). There
have been many attempts to solve [27] or characterize [14] such a
QRSS ($P_{\infty}$) for a given transition matrix $M$. Because a QRE and a
QRSS use similar formulae, with one allowing mixed strategies and
the other only pure strategies, in principle, the two should be
closely related. However, in this work, we will demonstrate that this
is generally not the case: QRSSs differ substantially from QREs
and SQREs. Differentiating a QRSS from a QRE and a SQRE is
this manuscript's third contribution.

A continuous-time smoothed best-response dynamic [7,28], for
simplicity using a single-population symmetric game as an
example,

$$\dot{\theta}^{i} = \lambda\left(\widetilde{BR}^{i}\left(\theta^{-i}\right) - \theta^{i}\right), \tag{28}$$

seems to be quite similar to our model, since its discrete version,
after setting the rate of strategy revision $\lambda$ to 1, is

$$\theta^{i}(t+1) = \widetilde{BR}^{i}\left(\theta^{-i}(t)\right). \tag{29}$$

Here, the notation of [7] is used: $\theta^{i}$ refers to the portion of the
population in strategic state $i$, and the smoothed best-response
function $\widetilde{BR}$ can take exactly the exponential form of the one
in Eq. (18). In this regard, we note that in this work, we focus on the
stability of the fixed points of this discrete dynamic, which we have not
seen in the literature; [28] discussed the stability of the
continuous-time counterpart. Furthermore, this stability analysis is
linked to the stability of QREs and thus in turn distinguishes unstable
QREs from stable QREs; as far as we know, this link has not
been investigated by others.

Next, we focus on a comparison between ILQRD and fictitious
play [1,5]. In fictitious play, players update their beliefs and choose
a pure strategy to play according to certain decision-making rules
that relate their strategy choice to their beliefs. Such decision-making
rules typically include, for example, the best response
myopic strategy [15] and the smoothed best response [29]. The
latter uses the same expression as in Eq. (21) with only one
difference: $\rho^{-i}$, which is the true current strategy
profile of the others, is replaced by player $i$'s belief regarding the
strategy profiles of the others, which is usually taken to be the
empirical distribution deduced from the entire history of other
players' choices. In Eq. (21), Eq. (13) and Eq. (14), there is no belief
and no empirical distribution. When we examine the learning
process in real life, it may seem more reasonable to take
player beliefs into consideration. However, as we have previously
noted, we are substantially more concerned with finding the
proper solutions, that is, solutions capable of predicting experimental
behavioral outcomes, than with making the entire dynamic
process meaningful. Because fictitious play extracts the empirical
distribution of other players' strategy profiles from history, the speed
of convergence occasionally becomes a problem [30,31]. As
discussed below, in ILQRD, convergence speed is never an issue.

A similar relation holds between ILQRD and stochastic reinforcement
learning [25]. Although the functional forms of the response
in the two models are quite similar, our ILQRD relies only on the
current state of all players, whereas the reinforcement learning model
takes some or all previous actions and payoffs into consideration.
A record of scores for every potential strategy is kept by every
player in reinforcement learning, whereas in ILQRD,
only the previous mixed strategy is used in the decision making of
the current strategy.

Another widely used dynamic process to determine the
preferred NE is replicator dynamics [12,13]. All such previously
mentioned static mappings or dynamics are based on introspective
thinking and thus differ from replicator dynamics, where each
player plays against a finite or infinite population and individuals
learn from simple imitation rather than through individual introspective
thinking. In this manuscript, we focus on the effects of
introspective thinking by only two players, not on a
model of population dynamics.

We should note that the same notion of ILQRD (referred to as 
Boltzmann iteration) was proposed in a 2004 unpublished working 
paper [18] by one of the authors in a quantum game context. The 
idea is not a central topic in that working paper and was not 
developed any further there. 

Additionally, a highly similar dynamic process was proposed in 
[32], as a concept referred to as noisy rational strategies (NRS): 

$$\rho_{nrs} = \lim_{n\to\infty} f_{0}\circ f_{1}\circ\cdots\circ f_{n}\left(\rho(0)\right). \tag{30}$$

There, the authors focus on the effect of increasing $\beta$
($\beta_{0} < \beta_{1} < \cdots < \beta_{n}$) and assume $\beta_{\infty} = \infty$ so that $\rho(0)$ becomes
irrelevant. According to the authors, such an increase in $\beta$ can be
interpreted as the increasing difficulty of performing a greater
number of iterations given a player's finite capability [32]. In this 
sense, what we discuss in this paper bears a greater resemblance to 
the following: 

$$p^{2}_{\infty} = \lim_{n\to\infty} f_{n}\circ\cdots\circ f_{2}\circ f_{1}\left(p^{2}(0)\right), \tag{31}$$

with a constant $\beta$, i.e., $\beta_{i} = \beta$. We do not assume that $\beta$ is
increasing or decreasing, or that $\beta_{\infty} = \infty$ or $\beta_{i} = \infty$. We do not
believe that it is more difficult to perform more iterations because
all iteration processes are supposed to be fictitious. We will
demonstrate that essentially none of the desired features of the SQRE
relies on details concerning the ordering of $\beta$ or increasing or decreasing
values of $\beta$; they depend instead on the iteration. The iteration alone
suffices to lead us to the preferred NE. Furthermore, we have not
found a thorough discussion of the stability of NRS solutions.
Thus, this manuscript can be regarded as a further development of
the NRS in that it distinguishes stable from unstable solutions and
notes that the key component is the iteration and not the ordering of
$\beta$ or the assumption of the limit of $\beta_{n}$ approaching $\infty$.

In sum, the proposed iterative process differs from many other 
theories in that it is a map from a mixed strategy of all of the 
players to a new mixed strategy of all of the players. One might 
have some questions with respect to the interpretation of this 
process and comparison with real game playing. However, in 
physics, it is natural to study the evolution of distribution functions 
directly instead of the evolution of individual trajectories. 
Additionally, we do not aim to make the process reflect
reality more closely, but only to make the long-term solution more
capable of predicting the game outcomes. Next, we discuss certain
features of the proposed iterative process and compare its solutions 
to observed behavioral outcomes of experiments. 

Major features of ILQRD, illustrated through 
examples 

In this section, first, we demonstrate by example that the QRE
covers all NEs, including pure and mixed NEs. This conclusion is
not new and has been implicitly demonstrated in [26]. For a 2x2
game, this statement can be proved by considering Eq. (18) and
Eq. (19) in the extreme case of $\beta\to\infty$. We present a general proof.
Second, we demonstrate that once there is a preferred pure NE in
a game, our SQRE converges to this focal NE. This phenomenon
serves as a natural refinement. Unfortunately, we have not proved
this conclusion mathematically. Thus, we illustrate this outcome
by examples. Third, we demonstrate that all QREs that
correspond to mixed NEs are not stable for the case of large $\beta$
($\beta\to\infty$) but that some of these QREs can be stable for finite $\beta$.
The third conclusion first questions the predictive power of QREs
(when they correspond to mixed NEs) and then redeems the QRE
as a possibly applicable solution when bounded rationality is
considered. This conclusion also distinguishes stable QREs from
unstable QREs, which enables examination of the applicability of
QREs to the explanation of experimental observations or real-life
phenomena. That is, in principle, it is no longer true that QREs
are strictly better than NEs. If the experimental data of a game are
located in the region of unstable QREs, the QRE is not a practical
solution concept for the game because unstable solutions are not
reachable following the evolution. As far as we know, the last two
conclusions (first, that SQREs converge to preferred NEs when
there are such NEs and thus represent a natural refinement of
NEs and, second, that QREs that correspond to mixed NEs become
unstable for large enough $\beta$) are new. This also implies that mixed
NEs are not directly applicable, since they are in a sense always
unstable in the limit of large $\beta$ (also according to best-response
dynamics), unless these mixed NEs are close to SQREs. We
believe this also improves our understanding of mixed NEs.
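The third conclusion can be made plausible with a short calculation based on Eq. (18) to Eq. (20). The following is a sketch under the assumption that stability of a fixed point of the map $f$ is judged by the usual one-dimensional criterion $|f'| < 1$; the shorthand $\Delta_{1}$, $\Delta_{2}$ and the starred fixed-point symbols are introduced here and do not appear in the original text.

```latex
% Eq. (18) rewritten as a logistic function of the payoff difference:
g(p;\beta|a,b,c,d) = \frac{1}{1 + e^{-\beta\left[(a-c)p + (b-d)(1-p)\right]}},
\qquad
\frac{\partial g}{\partial p} = \beta\,(a-b-c+d)\,g\,(1-g).

% At a fixed point (p^{1}_{*}, p^{2}_{*}) of Eq. (18)-(19), the chain rule
% applied to Eq. (20) gives
f'\!\left(p^{2}_{*}\right)
  = \beta^{2}\,\Delta_{1}\,\Delta_{2}\;
    p^{1}_{*}\left(1-p^{1}_{*}\right)\,p^{2}_{*}\left(1-p^{2}_{*}\right),
\qquad
\Delta_{1} = a-b-c+d,\quad \Delta_{2} = a'-b'-c'+d'.
```

If a QRE approaches an interior mixed NE as $\beta\to\infty$, the factors $p_{*}(1-p_{*})$ stay away from zero, so $|f'|$ grows like $\beta^{2}$ whenever $\Delta_{1}\Delta_{2}\neq 0$ and eventually exceeds 1. For a QRE that approaches a pure NE, $p_{*}(1-p_{*})$ shrinks exponentially in $\beta$, which outweighs the $\beta^{2}$ prefactor.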

Games with two pure NEs and a mixed NE: Coordination
Game and Hawk-Dove Game

It can be demonstrated that for games with a dominant strategy,
such as the prisoner's dilemma, one of the QREs and the SQRE
converge toward the dominant
strategy in the large-$\beta$ limit. However, this case is trivial. We
demonstrate the behavior of our ILQRD starting from the more
interesting coordination game, which does not have any dominant
strategies. The payoff matrices of the coordination game are as
follows:
$$G^{1,2} = \begin{bmatrix} \cdots & \cdots \\ 0,0 & 5,5 \end{bmatrix} \tag{32}$$

It is known that there are two pure-strategy NEs and a mixed
NE. They are $(p^{1},p^{2}) = (0,0), (1,1), (0.83,0.83)$, which are denoted
respectively as PNE$_{00}$, PNE$_{11}$ and MNE. Conventionally, the
preferred NE, PNE$_{00}$, of this game can be found to be the
focal NE through refinement [5,6]. Evolutionary stability analysis
[11] indicates that the mixed NE is unstable. That is, when
$p_{0} < p_{m} = 0.83$ ($p_{0} > p_{m}$), the population converges to PNE$_{00}$
(PNE$_{11}$). Because the $\beta$ we introduced has no absolute meaning,
in all of the manuscript's remaining calculations, we normalize
each player's payoffs by their own maximum. For the payoffs
provided in Eq. (32), the maximums are 5 and 5 for the first and
second player, respectively.

Figure 1. The iterative mapping for various $\beta$ and the line of $p^{1}(t+1) = p^{1}(t)$ on the coordination game. For a given value of $\beta$, the intersections of the
corresponding curve and the line of $p^{1}(t+1) = p^{1}(t)$ are the QREs. For this game, there is only one QRE for small values of $\beta$, whereas there are three
QREs (QRE$_{00}$, QRE$_{11}$ and QRE$_{MNE}$) for larger $\beta$. Here, QRE$_{00}$ and QRE$_{11}$ are always stable, whereas QRE$_{MNE}$, which is close to $p^{1} = 0.83$, is always unstable.
As an example, we illustrate the first two steps of the iterative process for a specific value of $\beta = 3.1$.
doi:10.1371/journal.pone.0105391.g001

Fig. 1 shows iterative mappings for a range of values of $\beta$ for this
coordination game. Each curve except the red curve (which is
the line $p^{1}(t+1) = p^{1}(t)$) is a curve of the iterative mapping for a given value of $\beta$.
The long arrow labelled $\beta\uparrow$ indicates the shift of the curve of the
iterative mapping when $\beta$ increases. As an example, we also
illustrate the first two steps of the iterative process for a specific
value of $\beta = 3.1$. Usually, fewer than 20 iterations are needed to find
the SQREs with reasonable accuracy starting from any initial
value of $p^{1}$. We can observe that for small $\beta$, there is only one
QRE, which corresponds to PNE$_{00}$ (denoted as QRE$_{00}$). For large
$\beta$, there are multiple QREs: QRE$_{00}$, a QRE that corresponds to
PNE$_{11}$ (denoted as QRE$_{11}$) and a QRE that corresponds to MNE
(QRE$_{MNE}$). However, not all of these QREs are equally good. One
can observe that for this game QRE$_{00}$ and QRE$_{11}$ are always
stable, whereas QRE$_{MNE}$ is always unstable. Here, stability means
that if the initial guess is not exactly at the QRE, one iteration step
will drive the value of $p^{1}$ closer to the QRE. Those QREs that are
stable in this sense are referred to as SQREs. To simplify our
terminology, we will refer to the QREs that correspond to pure
(mixed) NEs in the limit of $\beta\to\infty$ as pure (mixed) QREs, although
all QREs for all finite $\beta$ are in fact mixed strategies. The same
naming scheme and the same notation are used for SQREs, for
instance, pure SQRE$_{00}$, pure SQRE$_{11}$ and mixed SQRE$_{MNE}$.

In principle, all of the information on the QREs and SQREs of
this game is included in Fig. 1. However, to better illustrate the
stability of QREs, in Fig. 2, we plot the dependence on $\beta$ of the
stability of the QREs of the coordination game. From the lower
left section, where QRE$_{00}$ overlaps with SQRE$_{00}$, we can observe
that for small $\beta$ ($\beta < \beta_{c}$, which here is approximately 22), there is
only one QRE, and it is a SQRE. For all initial values of $p^{1}$, this
SQRE is the only long-term state. When $\beta > \beta_{c}$, there are three
QREs: QRE$_{00}$, QRE$_{11}$ and QRE$_{MNE}$. However, QRE$_{MNE}$ is
unstable. For initial values of $p^{1}$ less than the unstable QRE (the
green line, QRE$_{MNE}$), the iteration results in SQRE$_{00}$ (the filled
circles). Otherwise, SQRE$_{11}$ (the empty circles) becomes the
iteration's long-term solution. The corresponding $p^{2}$ (not shown
in the figure) can be straightforwardly calculated using Eq. (19).
Note from the upper right section that the region between SQRE$_{11}$
(the empty circles) and QRE$_{MNE}$ (the green line) is narrow
compared with the space between SQRE$_{00}$ (the filled circles) and
QRE$_{MNE}$ (the green line). This outcome indicates that for a wide
range of initial values of $p^{1}$, the SQRE of this game converges
toward SQRE$_{00}$, which is also PNE$_{00}$, the preferred NE in this
game. This picture, particularly the right half of Fig. 2, is highly
similar to results from evolutionary stability analysis. However, the
behavior with small $\beta$, such that no matter what the initial value of
$p^{1}$ is, the SQRE is always the QRE that corresponds to PNE$_{00}$, is a
unique result of our ILQRD. We believe that this unique SQRE
for small $\beta$ can be regarded as a refinement of NEs.

Figure 2. Dependence on $\beta$ of SQREs of the coordination game. When $\beta < \beta_{c}$, which here is approximately 22, there is only one QRE (QRE$_{00}$)
and it is a SQRE. When $\beta > \beta_{c}$, there are three QREs. However, the QRE$_{MNE}$ is unstable. For initial values of $p_{0}$ less than that of QRE$_{MNE}$ (the green line),
the iteration results in the QRE$_{00}$ (the pink square; thus, SQRE$_{00}$ in this case). Otherwise, the QRE$_{11}$ becomes the long-term solution of the iteration
SQRE$_{11}$ (the gold circle). The corresponding $p^{2}$ (not shown) can be straightforwardly calculated.
doi:10.1371/journal.pone.0105391.g002

In this game, we observe that QREs cover all NEs and that for a 
wide range of initial values of $p^1$, SQRE$_{00}$ is the SQRE, and it 
corresponds to the preferred NE ($p^1 = 0$). In particular, when $\beta<\beta_c$, 
the preferred NE is the only SQRE for all initial values of $p^1$. It is 
intuitive to expect that even for large $\beta$ the green line (QRE$_{MNE}$) is 
closer to the empty circles (QRE$_{11}$) than to the filled circles (QRE$_{00}$). 
Thus, for a wide range of initial values of $p^1$, SQRE$_{00}$ will be the 
long-term solution of the iterative process. In this sense, ILQRD and its 
SQRE represent a natural refinement for QREs and NEs. 
Additionally, we note that for this game, overall, our prediction 
is somewhat similar to that of evolutionary stability analysis. Next, 
we discuss another game, in which our prediction differs more 
significantly from the results of evolutionary stability analysis than 
it does in this coordination game. 

Now, we consider the hawk-dove game, which according to 
evolutionary game analysis [11] has one evolutionarily stable mixed 
NE and two evolutionarily unstable pure NEs. Its payoff matrices 
are defined as follows: 

$$\begin{pmatrix} 0,0 & 7,2 \\ 2,7 & 6,6 \end{pmatrix} \tag{33}$$

Figure 3. QREs and SQRE of the hawk-dove game. When $\beta<\beta_c$, which is approximately 10 for this game, starting from an arbitrary initial value 
of $p^1$, the long-term solution of the ILQRD of the hawk-dove game is SQRE$_{MNE}$. When $\beta>\beta_c$, there are three QREs: QRE$_{01}$, QRE$_{10}$ and QRE$_{MNE}$. However, 
here, QRE$_{MNE}$ is unstable. In this case, the SQRE depends on the initial value of $p^1$: when it is above the $p^1$ of QRE$_{MNE}$ (the green line), it is SQRE$_{10}$ (pink 
square). Otherwise, it is SQRE$_{01}$ (the gold circle). 
doi:10.1371/journal.pone.0105391.g003 



Based on a calculation of iterative mappings similar to that 
shown in Fig. 1, the QREs and SQREs of the hawk-dove game for 
various values of $\beta$ are plotted in Fig. 3. We observe that for small 
$\beta$, there is only one QRE (QRE$_{MNE}$) and it is a SQRE. This SQRE 
is the long-term solution of the ILQRD for all of the initial values 
of $p^1$. For large $\beta$, there are three QREs: QRE$_{01}$, QRE$_{10}$ and 
QRE$_{MNE}$. However, in this case, QRE$_{MNE}$ is unstable. The 
long-term solution of the ILQRD depends on the initial value of $p^1$. 
When the initial value is above the $p^1$ of QRE$_{MNE}$ (the green line), 
QRE$_{10}$ is the SQRE. Otherwise, QRE$_{01}$ is the SQRE. This figure 
provides more information than the QREs and NEs by 
distinguishing stable QREs from unstable ones. Additionally, this 
figure differs from the results of the evolutionary stability analysis, 
which demonstrates that the mixed NE of this game is 
evolutionarily stable. Our results suggest that it is stable only when 
$\beta<\beta_c$, which is approximately 10 for this game. Otherwise, the 
outcome of this game will be QRE$_{01}$ or QRE$_{10}$. This 
outcome differs from the results of evolutionary game analysis of 
the same game. For small $\beta$, QRE$_{MNE}$ is the only SQRE, 
whereas for large $\beta$, the mixed QRE becomes unstable: depending 
on the initial value of $p^1$, different SQREs can be reached. 
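The dependence on the initial value can be illustrated with a minimal Python sketch of the iteration for the hawk-dove game. It assumes the logit response of a symmetric 2x2 game, $1/(1+e^{\beta[(c-d-a+b)p+d-b]})$, applied once for each player per step, and it uses the raw payoffs of Eq. (33). The function names and the two values of $\beta$ are illustrative choices only; because the threshold depends on the payoff units, the numbers are not meant to reproduce the value of $\beta_c$ quoted for Fig. 3.

```python
import math

def logit_response(q, beta, a, b, c, d):
    """Probability of playing the first strategy against an opponent who
    plays the first strategy with probability q (logit quantal response)."""
    return 1.0 / (1.0 + math.exp(beta * ((c - d - a + b) * q + d - b)))

def iterate_symmetric(p0, beta, payoffs, steps=500):
    """Iterate p <- g(g(p)) for a symmetric 2x2 game and return the final p."""
    p = p0
    for _ in range(steps):
        q = logit_response(p, beta, *payoffs)  # opponent responds to p
        p = logit_response(q, beta, *payoffs)  # player responds to q
    return p

# (a, b, c, d) read off Eq. (33); raw payoff units are an assumption.
HAWK_DOVE = (0.0, 7.0, 2.0, 6.0)

for beta in (0.5, 3.0):  # illustrative "small" and "large" sensitivities
    ends = [iterate_symmetric(p0, beta, HAWK_DOVE) for p0 in (0.1, 0.9)]
    print(beta, [round(x, 4) for x in ends])
```

In this sketch, the smaller $\beta$ sends both initial values to the same mixed state, whereas the larger $\beta$ sends them to opposite, nearly pure states.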

Games with a unique mixed NE: Tennis Game 

The third example is the tennis game [33], which has one mixed 
NE ($p^1$, $p^2$) = (0.7, 0.6) but no pure NEs. The payoff matrices are 
given as follows. 

$$\begin{pmatrix} 5,5 & 8,2 \\ 9,1 & 2,8 \end{pmatrix} \tag{34}$$



Figure 4. NEs, QRE and SQRE of the tennis game. From (a), we observe that there is only one QRE (QRE$_{MNE}$, the green line) for each given value 
of $\beta$. There is one SQRE (SQRE$_{MNE}$, the green circle) when $\beta<\beta_c$, which is approximately 3.7 for this game, and there is no SQRE for $\beta>\beta_c$. (b) shows the 
plots of $p^1$ and $p^2$ and their relation with $\beta$. The green curve of triangles represents values of ($p^1$, $p^2$) when such divergent iterations are performed 
1000 times, i.e., those are the unstable run-away points. We observe again that when $\beta>\beta_c$, QRE$_{MNE}$ cannot be reached by any SQREs, although it 
converges towards the NE. 
doi:10.1371/journal.pone.0105391.g004 

Both the QRE and SQRE are shown in Fig. 4. We find that for 
this game there is always one and only one QRE for a given value 
of $\beta$ and that this QRE converges toward the mixed NE (0.7, 0.6) 
in the limit of $\beta\to\infty$. Therefore, this QRE is QRE$_{MNE}$. However, 
the SQRE follows this QRE$_{MNE}$ only when $\beta$ is small enough 
($\beta<\beta_c$, which for this game is approximately 3.7). When $\beta>\beta_c$, this 
QRE$_{MNE}$ becomes unstable; thus, there is no longer any SQRE. 
This game displays a substantial difference between the SQRE 
and the mixed NEs. Such games can be used to test the 
applicability of SQRE relative to NE, as shown in the following 
section. 
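As a rough check of these statements, the following Python sketch computes the mixed NE of the tennis game from the indifference conditions and then iterates a logit response for both players. The sequential update order (player 2 responds to $p^1$, then player 1 responds to $p^2$), the raw payoff units of Eq. (34) and all function names are assumptions made for illustration; the values of $\beta$ used are therefore not comparable with the $\beta_c$ quoted for Fig. 4.

```python
import math

# Payoffs read off Eq. (34): rows are player 1's strategies, columns player 2's.
U1 = ((5.0, 8.0), (9.0, 2.0))  # player 1's payoffs
U2 = ((5.0, 2.0), (1.0, 8.0))  # player 2's payoffs

def mixed_ne(u1, u2):
    """Mixed NE (p1, p2) of a 2x2 game from the indifference conditions."""
    # p2 makes player 1 indifferent between rows,
    # p1 makes player 2 indifferent between columns.
    p2 = (u1[1][1] - u1[0][1]) / (u1[0][0] - u1[0][1] - u1[1][0] + u1[1][1])
    p1 = (u2[1][1] - u2[1][0]) / (u2[0][0] - u2[0][1] - u2[1][0] + u2[1][1])
    return p1, p2

def logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def step(p1, beta):
    """One round: player 2 responds to p1, then player 1 responds to p2."""
    # Expected payoff advantage of column 1 over column 2 for player 2.
    adv2 = p1 * (U2[0][0] - U2[0][1]) + (1 - p1) * (U2[1][0] - U2[1][1])
    p2 = logit(beta * adv2)
    # Expected payoff advantage of row 1 over row 2 for player 1.
    adv1 = p2 * (U1[0][0] - U1[1][0]) + (1 - p2) * (U1[0][1] - U1[1][1])
    return logit(beta * adv1), p2

def run(p1, beta, rounds=1000):
    """Return the list of (p1, p2) visited over the given number of rounds."""
    history = []
    for _ in range(rounds):
        p1, p2 = step(p1, beta)
        history.append((p1, p2))
    return history

print(mixed_ne(U1, U2))
print(run(0.5, 0.1)[-1])   # small beta: the iteration settles
print(run(0.5, 2.0)[-2:])  # large beta: the iteration keeps jumping
```

With the small $\beta$ the trajectory settles to a single point regardless of the initial value, whereas with the large $\beta$ consecutive iterates remain far apart, which is the behavior described as having no SQRE.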

From these examples, we observed the following: first, QREs 
exist for all of the games discussed above, and QREs cover all of 
the NEs in the limit of $\beta\to\infty$; second, for games with a preferred 
NE, the SQRE can be used as a refinement of the NEs; and 
finally, mixed QREs become unstable for large enough $\beta$ values; 
thus, the SQRE can be regarded as a refinement of the QREs (and 
therefore the NEs). These observations reflect the three main 
features of our ILQRD. As we pointed out earlier, the first feature was 
implicitly demonstrated in [26]. However, the latter two features 
are new. For certain games, the distance between the SQRE and 
the mixed NE is large. Thus, experimental results on such games 
are suitable for testing the applicability of the SQRE relative to the 
NE. Before we proceed to a comparison of our theoretical 
prediction with experimental results, we prove the previously 
mentioned first and third features of our ILQRD. Unfortunately, 
we cannot prove the second feature because at present we 
do not know the necessary and sufficient condition for a game to 
have a preferred NE. Instead, we simply demonstrate that 
for the coordination game and the hawk-dove game, for small $\beta$, 
the SQRE corresponds to a certain refinement: SQRE$_{00}$ for the 
former game [5,6] and SQRE$_{MNE}$ for the latter [11]. 

Proof of main conclusions on 2x2 symmetric 
games 

In this section, while considering a symmetric game for 
simplicity, we wish to prove the previously described three features 
using examples. 

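Let $a' = a$, $b' = b$, $c' = c$, $d' = d$ in Eq. (13) and Eq. (14). If we 
merge the two equations and focus only on $p^1$, the iteration 
function becomes 

$$p^1(t+1) = \frac{1}{1 + \exp\left\{\beta\left[\dfrac{c-d-a+b}{1 + e^{\beta\left[(c-d-a+b)\,p^1(t) + d - b\right]}} + d - b\right]\right\}} \tag{35}$$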



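First, there are five possible NEs: the pure NEs (0, 0), (1, 1), (1, 0), 
(0, 1) and a mixed NE 

$$\left(\frac{b-d}{c-d-a+b},\ \frac{b-d}{c-d-a+b}\right),$$

depending on the values of $a$, $b$, $c$ and $d$. Let us first demonstrate 
that the QREs cover all of the NEs under proper conditional 
relations among $a$, $b$, $c$ and $d$. That is, Eq. (35) has five possible 
solutions: QRE$_{00}(\beta)$, QRE$_{11}(\beta)$, QRE$_{10}(\beta)$, QRE$_{01}(\beta)$ and 
QRE$_{MNE}(\beta)$, which correspond to the previously mentioned five NEs in 
the limit of $\beta\to\infty$. In terms of these notations, we wish to 
demonstrate that 

$$\begin{array}{ll}
\mathrm{QRE}_{00}(\beta) & \text{if } b-d<0 \ \&\ c-a>0 \\
\mathrm{QRE}_{00}(\beta)\ \&\ \mathrm{QRE}_{11}(\beta) & \text{if } b-d<0 \ \&\ c-a<0 \\
\mathrm{QRE}_{10}(\beta)\ \&\ \mathrm{QRE}_{01}(\beta) & \text{if } b-d>0 \ \&\ c-a>0 \\
\mathrm{QRE}_{11}(\beta) & \text{if } b-d>0 \ \&\ c-a<0 \\
\mathrm{QRE}_{MNE}(\beta) & \text{if } 0<\dfrac{b-d}{b-d+c-a}<1
\end{array} \tag{36}$$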
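The case distinction of Eq. (36) can be written as a small lookup. The Python sketch below is only an illustration of the conditions as stated: the function name and the string labels are hypothetical, and the second payoff tuple is a made-up coordination-type game used solely to exercise the second case.

```python
def qre_branches(a, b, c, d):
    """List the QRE branches that Eq. (36) associates with a symmetric
    2x2 game whose payoffs are (a, b, c, d)."""
    branches = []
    if b - d < 0 and c - a > 0:
        branches.append("QRE_00")
    if b - d < 0 and c - a < 0:
        branches += ["QRE_00", "QRE_11"]
    if b - d > 0 and c - a > 0:
        branches += ["QRE_10", "QRE_01"]
    if b - d > 0 and c - a < 0:
        branches.append("QRE_11")
    denom = b - d + c - a
    if denom != 0 and 0 < (b - d) / denom < 1:
        branches.append("QRE_MNE")
    return branches

print(qre_branches(0, 7, 2, 6))  # hawk-dove payoffs of Eq. (33)
print(qre_branches(2, 0, 0, 1))  # hypothetical coordination-type payoffs
```

For the hawk-dove payoffs this returns the two asymmetric pure branches and the mixed branch, in line with the three QREs reported for large $\beta$.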



We will prove the first and the last cases; the extensions 
to the other cases are trivial. 

It is straightforward to demonstrate that when $b-d<0$ & $c-a>0$, 

$$f(0;\beta)>0 \quad\text{and}\quad \lim_{\beta\to\infty} f(0;\beta) = 0 \quad\text{and}\quad 0 < \lim_{\beta\to\infty} \left.\frac{\partial f(p;\beta)}{\partial p}\right|_{p=0} < 1. \tag{37}$$

These three conditions mean that $f(0;\beta)$ is always greater than 
0 but approaches 0 more closely as $\beta$ increases. 
Additionally, when $\beta$ is sufficiently large, $f(p;\beta)$ increases near 
$p = 0$, but it increases more slowly than $p$. This statement is 
equivalent to stating that $f(0;\beta) - 0>0$, and there is a $p$ such that 
$f(p;\beta) - p<0$ when $\beta$ is sufficiently large. Thus, there must be a 
$p^*(\beta)$ such that $f(p^*;\beta) - p^* = 0$. In the limit of $\beta\to\infty$, such 
$p^*(\beta) = 0$. The situation for $p^2$ can be analysed similarly. 

Figure 5. For all of the games discussed above and below, the $\beta_c$ found in the iterative calculation is plotted against the 
$\beta_c$ which is solved. 

For QRE$_{MNE}$, where $0 < p^* = \frac{b-d}{b-d+c-a} < 1$ is a proper 
mixed strategy, we first want to demonstrate that 

$$\lim_{\beta\to\infty} f(p^*;\beta) = p^* \quad\text{and}\quad \lim_{\beta\to\infty} \left.\frac{\partial f(p;\beta)}{\partial p}\right|_{p=p^*} - 1 \neq 0. \tag{38}$$

If we can prove this point, then, first, $p^*$ is a fixed point in the 
limit of $\beta\to\infty$. Second, this fixed point is not a maximum or 
minimum of the function $f(p;\beta) - p$. The latter means that the 
curve $f(p;\beta) - p$ passes across 0 when $\beta$ is sufficiently large and 0 is 
not an extremum. 

Because we are working with symmetric games, the iteration 
function $f(p;\beta)$ can be regarded as a composite mapping of the 
following function: 

$$f(p;\beta) = g(g(p;\beta);\beta) = g \circ g, \tag{39}$$

where 

$$g(p;\beta) = \frac{1}{1 + e^{\beta\left[(c-d-a+b)p + d - b\right]}}. \tag{40}$$

In terms of this function, we can then easily demonstrate that 

$$\lim_{\beta\to\infty} g(p^*;\beta) = p^* \quad\text{and}\quad \lim_{\beta\to\infty} \left.\frac{\partial g(p;\beta)}{\partial p}\right|_{p=p^*} - 1 \neq 0. \tag{41}$$

Here, we discuss the first part in greater detail, whereas the 
remainder is straightforward. The fixed point of the mapping $g$ is 
defined equivalently as follows: 

$$p(t) = \frac{b - d + \frac{1}{\beta}\ln\frac{1-p(t)}{p(t)}}{c-d-a+b}. \tag{42}$$

For $p(t)$ that is close enough to $p^*$ such that $0<p(t)<1$, the limit 
of the RHS of Eq. (42) when $\beta\to\infty$ becomes exactly $p^*$. 

Up to this point, we have demonstrated that the QREs of Eq. 
(35) cover all of the possible pure strategy NEs and the mixed NEs. 
Next, we discuss their stability. We wish to demonstrate that any 
pure QREs, if they exist, are always SQREs and that mixed QREs are 
SQREs only for small $\beta$ but unstable for large $\beta$. Additionally, we 
attempt to define $\beta_c$, the critical value of $\beta$. 

Stable solutions are a subset of fixed points. For the iterative 
mapping defined in Eq. (35), we use linear stability analysis [34], 
according to which a solution of Eq. (35) is stable if the magnitude 
of the Jacobian (simply a derivative in this case) of the right-hand 
side is less than 1, i.e., 

$$\left|\frac{\partial f}{\partial p}\right|_{p^*} = \left[\beta(c-d-a+b)\right]^2 \left[1-p^*(\beta)\right]^2 \left[p^*(\beta)\right]^2 < 1, \tag{43}$$

where $p^*(\beta)$ is defined as in 

$$p^* = f(p^*;\beta). \tag{44}$$



For pure QREs, for example, QRE$_{00}$, $p^*(\beta)$ decreases to 0 
exponentially as $\frac{1}{1+e^{\beta(d-b)}}$. Therefore, $\beta p^*(\beta)\to 0$. Thus, Eq. (43) 
is always satisfied. Other pure QREs can similarly be shown to be 
stable. Thus, all of them are SQREs. 

Figure 6. Correlation index $C$ obtained from QRSS plotted against $\beta$. The correlation index of the QRSS is always non-zero except, again, in 
cases of extremely small $\beta$. 
doi:10.1371/journal.pone.0105391.g006 

For mixed QREs, first consider the case of $\beta = 0$. In this case, 
the payoff makes no difference; therefore, $p^*(\beta = 0) = \frac{1}{2}$, and thus 

$$\left|\frac{\partial f}{\partial p}\right|_{p^*} = 0 < 1. \tag{45}$$

Therefore, mixed QREs are SQREs in the limit of $\beta\to 0$. Now, 
consider the case of $\beta\to\infty$. Because $0 < p^*(\beta) < 1$ is a finite 
number, there is always a $\beta_c$ such that when $\beta>\beta_c$, Eq. (43) is no 
longer valid. $\beta_c$, which is the critical value of $\beta$, is defined as 
follows: 

$$\left[\beta_c(c-d-a+b)\right]^2 \left[1-p^*(\beta_c)\right]^2 \left[p^*(\beta_c)\right]^2 = 1. \tag{46}$$

For a given game with fixed values of $a$, $b$, $c$ and $d$, the 
value of $\beta_c$ can be solved numerically using Eq. (44) and 
Eq. (46) together. In Fig. 5, we plot all $\beta_c$ found in numerical 
simulation of ILQRD against the $\beta_c$ solved from 
Eq. (44) and Eq. (46). We found that the two 
values agree with each other well. 
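One possible way of solving Eq. (44) and Eq. (46) together is nested bisection, sketched below in Python for the hawk-dove payoffs of Eq. (33). The sketch makes several assumptions that are not taken from the text: it restricts attention to the symmetric mixed fixed point $p^* = g(p^*;\beta)$, it assumes $c-d-a+b>0$ so that this fixed point is unique, it assumes the left-hand side of Eq. (43) grows monotonically with $\beta$, and it uses raw payoff units. The function names and the search bounds are illustrative, and the resulting number depends on the payoff scale, so it is not expected to coincide with the value quoted for Fig. 3.

```python
import math

def g(p, beta, a, b, c, d):
    """Logit response of Eq. (40)."""
    return 1.0 / (1.0 + math.exp(beta * ((c - d - a + b) * p + d - b)))

def bisect(func, lo, hi, iters=100):
    """Root of func on [lo, hi]; assumes a single sign change in between."""
    lo_positive = func(lo) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (func(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mixed_fixed_point(beta, payoffs):
    """Symmetric solution of Eq. (44); g is decreasing in p when
    c - d - a + b > 0, so g(p) - p changes sign exactly once on [0, 1]."""
    return bisect(lambda p: g(p, beta, *payoffs) - p, 0.0, 1.0)

def stability(beta, payoffs):
    """Left-hand side of Eq. (43) evaluated at the mixed fixed point."""
    a, b, c, d = payoffs
    p = mixed_fixed_point(beta, payoffs)
    return (beta * (c - d - a + b) * p * (1.0 - p)) ** 2

def critical_beta(payoffs, beta_max=50.0):
    """Value of beta at which Eq. (46) holds (search bound is a guess)."""
    return bisect(lambda beta: stability(beta, payoffs) - 1.0, 1e-9, beta_max)

HAWK_DOVE = (0.0, 7.0, 2.0, 6.0)  # (a, b, c, d) from Eq. (33), raw units
beta_c = critical_beta(HAWK_DOVE)
print(beta_c, mixed_fixed_point(beta_c, HAWK_DOVE))
```

Bisection is chosen here only because it needs no derivatives and no external libraries; any one-dimensional root finder would serve the same purpose.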

For asymmetric games, a similar $\beta_c$ can be derived. For simplicity, let 
us define $C(p^*,\beta_c;a,b,c,d) = \beta_c(c-d-a+b)(1-p^*(\beta_c))\,p^*(\beta_c)$. 
Then, the critical value of $\beta$ is determined by 

$$\left|C(p^{1,*},\beta_c;a,b,c,d)\; C(p^{2,*},\beta_c;a',b',c',d')\right| = 1, \tag{47}$$

$$p^{1,*} = g(p^{2,*};\beta_c;a,b,c,d), \tag{48}$$

$$p^{2,*} = g(p^{1,*};\beta_c;a',b',c',d'). \tag{49}$$

For all the examples used in the previous section and also the payoff 
matrices of the experiments discussed in the next section, we plot the 
values of $\beta_c$ that are numerically solved from these equations against 
the corresponding ones found from simulations. We found that the values 
of $\beta_c$ numerically solved and the values found from simulations are in 
very good agreement. 
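For an asymmetric game, Eqs. (47) to (49) can be solved in the same spirit. The Python sketch below does so for the tennis game of Eq. (34), under assumptions that should be read as such: player 2's payoffs are re-indexed from that player's own point of view as $(a',b',c',d') = (5,1,2,8)$, the raw payoff units are used, the composite map $g\circ g$ is assumed to be decreasing (true for these payoffs because the two coefficients $c-d-a+b$ have opposite signs), and the product in Eq. (47) is assumed to grow monotonically with $\beta$. Names and search bounds are illustrative, and the resulting value is not expected to match the $\beta_c$ quoted for Fig. 4, which depends on the payoff scale.

```python
import math

def g(q, beta, a, b, c, d):
    """Logit response to an opponent playing the first strategy with prob. q."""
    return 1.0 / (1.0 + math.exp(beta * ((c - d - a + b) * q + d - b)))

def C(p_star, beta, a, b, c, d):
    """The factor defined just before Eq. (47)."""
    return beta * (c - d - a + b) * (1.0 - p_star) * p_star

def bisect(func, lo, hi, iters=100):
    """Root of func on [lo, hi]; assumes a single sign change in between."""
    lo_positive = func(lo) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (func(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

P1 = (5.0, 8.0, 9.0, 2.0)  # (a, b, c, d) for player 1, from Eq. (34)
P2 = (5.0, 1.0, 2.0, 8.0)  # (a', b', c', d') for player 2 (assumed indexing)

def fixed_point(beta):
    """Solve Eqs. (48)-(49): p1 = g(p2; P1) with p2 = g(p1; P2)."""
    p1 = bisect(lambda p: g(g(p, beta, *P2), beta, *P1) - p, 0.0, 1.0)
    return p1, g(p1, beta, *P2)

def product(beta):
    """Left-hand side of Eq. (47) at the fixed point for this beta."""
    p1, p2 = fixed_point(beta)
    return abs(C(p1, beta, *P1) * C(p2, beta, *P2))

beta_c = bisect(lambda beta: product(beta) - 1.0, 1e-9, 20.0)
print(beta_c, fixed_point(beta_c))
```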

We have demonstrated that for symmetric 2x2 games, our 
ILQRD has the following characteristics: (1) the QREs cover all of 
the pure and mixed NEs, and (2) pure QREs are SQREs, whereas 
mixed QREs are SQREs for small $\beta$ but unstable for sufficiently 
large $\beta$. These conclusions cover all the major features that we 
demonstrated in the preceding section using specific examples. A 
general proof for general 2x2 games should be straightforward, 
although it would involve more tedious algebra. 

Table 1. Payoff matrix of ten game samples. 

From this general proof, we have observed that the QRE exists 
for all values of $\beta$, whereas the SQRE does not. Intuitively, when $\beta$ 
increases, the QREs approach NEs because our iteration process 
moves closer to the best response dynamics. However, QREs 
might lose their stability when $\beta$ is sufficiently large. The difference 
between the SQRE and the NE depends on the competition 
between these two effects: approaching NE and losing stability. 
This difference provides a means of examining the applicability of 
the QRE. The QRE has been criticized because the fixed points, 
which are what we refer to as QREs, can always outperform the mixed 
NE as a result of the free parameter $\beta$. We agree with this 
statement. Therefore, the fact that experimental data are closer to the 
QREs than to the mixed NEs does not imply that the QRE is a better 
solution concept than the NE. However, this criticism does not 
hold for our SQRE, which loses its stability for larger $\beta$, indicating 
that our SQRE cannot always outperform mixed NEs. By 
assessing whether the experimental data points are closer to our 
SQRE than to the mixed NEs, we can compare the SQRE solution 
concept with the NE. 

Difference between QRSS and SQRE 

QRSS, which is occasionally referred to as Logit response 
dynamics [14,27], starts from an arbitrary strategy profile for each 
player and then uses the transition matrix defined in Eq. (22) to 
evolve the strategic states of all of the players into the invariant 
distribution defined in Eq. (27). Its stationary states have been 
discussed by several researchers. However, there is no full study on 
the necessary and sufficient conditions of the convergence of the 
QRSS to NE [14,24,27]. In this section, we demonstrate that 
although Eq. (22) appears similar to Eq. (21), $P_{ss}$ as defined in Eq. 
(27) differs substantially from $p_\infty(\beta)$. As discussed below, the 
difference is that $P_{ss}$ is a distribution in $\Delta(S)$, which is the set of all 
of the possible distribution functions on $S$, whereas $p_\infty(\beta)$ is a 
member of $\Delta$, which is the set of independent distribution 
functions. That is, $P_{ss}$ includes possibly correlated strategies, 
whereas $p_\infty(\beta)$ describes only purely non-cooperative strategies. 

Now, we demonstrate this statement using one example: a 2x2 
symmetric game with payoff matrices [14]: 

$$\begin{pmatrix} 1,1 & 0,0 \\ 0,0 & 1,1 \end{pmatrix} \tag{50}$$





Game   AA   AB   BA   BB   NE ($p^1$, $p^2$)
1      77   35    8   48   (0.4878, 0.1585)
2      73   74   87   20   (0.9853, 0.7941)
3      63    8    1   17   (0.2253, 0.1268)
4      55   75   73   60   (0.3939, 0.4545)
5       5   64   93   40   (0.4732, 0.2143)
6      46   54   61   23   (0.8261, 0.6739)
7      89   53   82   92   (0.2174, 0.8478)
8      88   38   40   55   (0.2308, 0.2615)
9      40   76   91   23   (0.6538, 0.5096)
10     69    5   13   33   (0.2381, 0.3333)

The game in Eq. (50) is a potential game [14]. It has two strict NEs and a 
proper mixed NE. According to [14] and [27], the corresponding 
Logit response dynamics has an invariant distribution of strategy 
profiles that correspond to the potential maximizer and the proper 
mixed NE. Here, we show that the invariant distribution in fact 
does not correspond to the proper mixed NE. It involves 
correlations between players. First, we calculate the probability 
transition matrix $M$ and then obtain the $P_{ss}$ of this game according 
to Eq. (27). From the $P_{ss}$, we find the reduced strategy profile $P^1$ for 
player 1 and $P^2$ for player 2: 

$$P^1(s^1_l) = \sum_m P_{ss}(s^1_l, s^2_m), \qquad P^2(s^2_m) = \sum_l P_{ss}(s^1_l, s^2_m). \tag{51}$$

Then, we define the correlation index as 

$$C = \sum_{l,m}\left[P_{ss}(s^1_l, s^2_m) - P^1(s^1_l)\,P^2(s^2_m)\right]^2. \tag{52}$$

The correlation index is zero when the joint distribution $P_{ss}$ is a 
product of two independent probability distributions, and it is 
nonzero otherwise. In Fig. 6, we plot the correlation index 
$C$ calculated for the QRSS. We find that the correlation index of 
the QRSS is always non-zero except in cases of extremely small $\beta$ 
values. It is obvious that the SQRE is uncorrelated and that its 
correlation index remains 0. In fact, the joint distribution of the 
SQRE is defined as such a product. 

Table 2. The proportion of A choices in 500 trials by each of 
the players. 

Game   Player 1   Player 2
1      0.591      0.318
2      0.84       0.36
3      0.583      0.222
4      0.274      0.502
5      0.378      0.32
6      0.638      0.41
7      0.295      0.522
8      0.4        0.226
9      0.562      0.449
10     0.32       0.202

doi:10.1371/journal.pone.0105391.t002 
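A correlation index of this kind can be computed directly from a joint distribution over strategy profiles. The Python sketch below assumes that the index sums the squared deviations of the joint distribution from the product of its marginals; the function name and the two example distributions are illustrative and are not taken from the QRSS calculation itself.

```python
def correlation_index(joint):
    """Sum of squared deviations of a joint distribution from the product
    of its marginals; joint[l][m] is the probability of profile (l, m)."""
    p1 = [sum(row) for row in joint]        # marginal of player 1
    p2 = [sum(col) for col in zip(*joint)]  # marginal of player 2
    return sum((joint[l][m] - p1[l] * p2[m]) ** 2
               for l in range(len(p1)) for m in range(len(p2)))

# Product of the marginals (0.7, 0.3) and (0.6, 0.4): uncorrelated.
independent = [[0.42, 0.28], [0.18, 0.12]]
# All weight on the two coordinated profiles: strongly correlated.
correlated = [[0.5, 0.0], [0.0, 0.5]]

print(correlation_index(independent))
print(correlation_index(correlated))
```

The first distribution gives an index of zero up to rounding, whereas the second gives a strictly positive value, which is the distinction drawn between the SQRE and the QRSS.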

In this paper, we have not discussed the applicability of 
correlated strategies in non-cooperative games [19] and will not 
address it. However, we have noted that the invariant distribution 
of the QRSS or Logit response dynamics generally results in 
correlated strategies and that the QRSS differs from the QRE and the 
SQRE. 

Figure 7. NE, QRE, SQRE and experimental data of game 3: the average strategy profile is in the region of the SQRE of this game 
and relatively distant from the NE. Five games out of the ten games exhibit a similar behavior. In the inset is the payoff matrix of this game. 

Figure 8. NE, QRE, SQRE and experimental data of game 8: the average strategy profile is in the region of unstable solutions but 
remains closer to the SQRE than to the NE. Three games out of the ten games exhibit a similar behavior. In the inset is the payoff matrix of this 
game. 

Experimental results re-analysed using ILQRD 

As previously explained, a key difference between this 
manuscript and other studies on the QRE is that beyond certain 
values of $\beta$, the QRE becomes unstable. Therefore, even when the 
experimental data are better fit by the QRE, if the data are near the 
unstable region, then the applicability of the QRE is questionable. In 
this section, we examine how close the experimental data are to the 
SQRE. We focus on experiments that involve 2x2 games with a 
unique mixed NE and require the games to have a SQRE 
relatively distant from such a mixed NE, i.e., games such as the one 
in Fig. 4. Erev et al. [35] conducted experiments on forty 2x2 
constant-sum games. Each pair of players played one game 500 times. Among 
these experimental games, there were ten games in which each 
game was played by nine pairs of subjects, whereas the other thirty 
games were played by one pair of subjects. Here, we use only 
experimental data from the ten games played by nine subject pairs. 
The payoff matrices of the ten games are shown in Table 1, which 
is reproduced from Table 1 in [35]. The sixth column shows the 
NE of each game. Each player was asked to choose between A and 
B. The payoff entry AB presents player 1's winning probability 
(×100) when the player chooses A and the player's opponent 
chooses B, and so on. The payoff for each win was 4 cents. All ten 
games were played by fixed pairs for 500 trials. Table 2 shows the 
proportion of A choices in the 500 trials by each player. We would 
have preferred data from 500 pairs of independent players 
to data from games repeated 500 times by the same pair of 
players. However, first, we did not find such data, and second, for 
constant-sum games, it is believed that a repeated game 
produces no surprising result. This is unlike social dilemma games, 
such as the prisoner's dilemma, for which the experimental 
behavioral outcome changes completely when the game is repeated 
even a finite number of times. 
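The NE column of Table 1 can be reproduced from the four payoff entries with the usual indifference conditions. The Python sketch below assumes that $p^1$ and $p^2$ are the probabilities with which players 1 and 2 choose A and that player 2's winning probability is 100 minus player 1's (constant sum); the dictionary and function names are illustrative.

```python
# (AA, AB, BA, BB): player 1's winning probability (x100), from Table 1.
TABLE_1 = {
    1: (77, 35, 8, 48),
    2: (73, 74, 87, 20),
    3: (63, 8, 1, 17),
    4: (55, 75, 73, 60),
    5: (5, 64, 93, 40),
    6: (46, 54, 61, 23),
    7: (89, 53, 82, 92),
    8: (88, 38, 40, 55),
    9: (40, 76, 91, 23),
    10: (69, 5, 13, 33),
}

def mixed_ne(aa, ab, ba, bb):
    """Mixed NE (p1, p2) of a 2x2 constant-sum game.

    p1 makes player 2 indifferent between A and B,
    p2 makes player 1 indifferent between A and B."""
    denom = aa - ab - ba + bb
    p1 = (bb - ba) / denom
    p2 = (bb - ab) / denom
    return p1, p2

for game, entries in TABLE_1.items():
    p1, p2 = mixed_ne(*entries)
    print(game, round(p1, 4), round(p2, 4))
```

Up to rounding in the last digit, the printed pairs agree with the sixth column of Table 1.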

According to the payoff matrices in Table 1, we obtain the 
SQRE and the QRE of the ten games and compare the experimental 
data points with our SQRE. There are five games (games 2, 3, 6, 
7, 9) for which the experimental data points are in the area of the 
SQRE. We show one of them in Fig. 7 as an example. The 
experimental data from three games (games 1, 5, 8) are in the area 
of unstable solutions but remain closer to our SQRE than to the 
NE (Fig. 8). As shown in Fig. 9, the experimental data points of the 
remaining two games (games 4, 10) are closer to the NE than to the 
SQRE. From these simple and limited comparisons, we conclude 
that the SQRE fits the observed behavior in real experiments 
better than the NE (5+3:2 in favor of the former in this limited 
comparison). However, further examination of the relation 
between the experimental data and our SQRE is required to 
arrive at a more definitive answer. We do not yet have any 
qualitative or quantitative criteria for identifying games whose expected 
experimental behavior is close to the SQRE. This question will be 
examined in future investigations. 

Figure 9. NE, QRE, SQRE and experimental data of game 10: the average strategy profile is closer to the NE than to the SQRE. Two 
games out of the ten games exhibit a similar behavior. In the inset is the payoff matrix of this game. 
doi:10.1371/journal.pone.0105391.g009 

Conclusion and Discussion 

In this manuscript, using the Logit quantal response function 
form (the Boltzmann distribution in statistical physics) to link the 
choice of strategy to the corresponding payoff in every step, we 
construct an iterative Logit quantal response dynamic process. 
Thus, this work can be regarded as a dynamic version of the 
Logit quantal response equilibrium. Importantly, our dynamic 
process differs from the so-called Logit response dynamics, which 
generally results in correlated equilibrium, even for non-cooperative 
games. 

It has been shown in [21] that the QRE exists for all of the 
values of $\beta$, a measure of the level of players' payoff sensitivity, and 
converges toward NEs when $\beta\to\infty$. It has also been demonstrated 
on some examples and been taken for granted by some researchers 
that in fitting experimental data, the QRE is generally better than 
the NE because it is free to change the value of $\beta$ to improve the 
fitting [23,36,37]. In our manuscript, we demonstrate that this is 
not the case: when taking stability into consideration, in principle, 
the QRE is no longer always better than the NE. Based on the 
dynamic process, stable and unstable QREs are distinguished. We 
find the following: (1) For games with a single focal pure NE, there 
is always one stable QRE that converges toward the preferred NE 
when $\beta\to\infty$. (2) For games without any focal pure NEs but with 
one unique proper mixed NE, when the payoff sensitivity $\beta$ is 
sufficiently large ($\beta>\beta_c$), the QREs lose their stability. 
For certain games, the QREs are already close to their 
corresponding NEs before they lose their stability. Therefore, the 
difference between stable QREs and NEs is small. For other 
games, the difference between stable QREs and NEs is substantially 
more pronounced. 

The latter case could be used to assess the applicability of the 
QRE to experiments and real-life observations. We then 
compared the stable and unstable QREs with experimental data. 
We found that the experimental observations of certain games (5 
games from our preliminary tests) yield results within the regions 
of the stable QRE; that in other games (3 games), the experimental 
data are located in the unstable regions but remain closer to stable 
QREs than to the mixed NEs; and that for other games (2 games), 
the experimental results are closer to the mixed NEs than to the 
stable QREs. We also believe that linking mixed NEs to mixed 
SQREs improves our understanding of the applicability of mixed NEs. 

We have not identified any qualitative or quantitative criteria 
with which to classify games from this perspective. Further 
experimental and theoretical investigations are required to reach 
such a conclusion. In the proof section, we only present a proof of the main 
observed features of our dynamic process for symmetric 2x2 
games. In the future, a general discussion of the features and their 
proof for NxM games should be undertaken and cross-game 
experiments when performed and compared against our SQREs 